Archiving the Relaxed Consistency Web
The historical, cultural, and intellectual importance of archiving the web
has been widely recognized. Today, all countries with high Internet penetration
rates have established high-profile archiving initiatives to crawl and archive
the fast-disappearing web content for long-term use. As web technologies
evolve, established web archiving techniques face challenges. This paper
focuses on the potential impact of the relaxed consistency web design on
crawler-driven web archiving. Relaxed-consistency websites may disseminate,
albeit ephemerally, inaccurate and even contradictory information. If captured
and preserved in the web archives as historical records, such information will
degrade the overall archival quality. To assess the extent of such quality
degradation, we build a simplified feed-following application and simulate its
operation with synthetic workloads. The results indicate that a non-trivial
portion of a relaxed consistency web archive may contain observable
inconsistency, and the inconsistency window may extend significantly longer
than that observed at the data store. We discuss the nature of such quality
degradation and propose a few possible remedies.
Comment: 10 pages, 6 figures, CIKM 201
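The capture-time inconsistency the abstract describes can be illustrated with a toy simulation (a minimal sketch; the delay, time horizon, and function names here are hypothetical, not the paper's actual feed-following workload):

```python
import random

def simulate_archive_inconsistency(num_posts=1000, propagation_delay=5.0,
                                   horizon=100.0, seed=0):
    """Toy model of crawling a feed-following site under relaxed consistency.

    Each post appears on its author's page immediately but reaches follower
    feeds only after a fixed propagation delay. A snapshot taken at a random
    crawl time is inconsistent if the author's page shows a post that the
    follower feed does not yet contain.
    """
    random.seed(seed)
    inconsistent = 0
    for _ in range(num_posts):
        post_time = random.uniform(0, horizon)
        crawl_time = random.uniform(0, horizon + propagation_delay)
        on_author_page = crawl_time >= post_time
        in_follower_feed = crawl_time >= post_time + propagation_delay
        if on_author_page and not in_follower_feed:
            inconsistent += 1
    return inconsistent / num_posts

rate = simulate_archive_inconsistency()
```

Even this crude model shows how a data-store inconsistency window translates into archived snapshots that disagree with themselves, and how a longer propagation delay widens the fraction of inconsistent captures.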
RobustMQ: Benchmarking Robustness of Quantized Models
Quantization has emerged as an essential technique for deploying deep neural
networks (DNNs) on devices with limited resources. However, quantized models
exhibit vulnerabilities when exposed to various noises in real-world
applications. Despite the importance of evaluating the impact of quantization
on robustness, existing research on this topic is limited and often disregards
established principles of robustness evaluation, resulting in incomplete and
inconclusive findings. To address this gap, we thoroughly evaluated the
robustness of quantized models against various noises (adversarial attacks,
natural corruptions, and systematic noises) on ImageNet. The comprehensive
evaluation results empirically provide valuable insights into the robustness of
quantized models in various scenarios, for example: (1) quantized models
exhibit higher adversarial robustness than their floating-point counterparts,
but are more vulnerable to natural corruptions and systematic noises; (2) in
general, increasing the quantization bit-width results in a decrease in
adversarial robustness, an increase in natural robustness, and an increase in
systematic robustness; (3) among corruption methods, \textit{impulse noise} and
\textit{glass blur} are the most harmful to quantized models, while
\textit{brightness} has the least impact; (4) among systematic noises, the
\textit{nearest neighbor interpolation} has the highest impact, while bilinear
interpolation, cubic interpolation, and area interpolation are the three least
harmful. Our research contributes to advancing the robust quantization of
models and their deployment in real-world scenarios.
Comment: 15 pages, 7 figure
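The bit-width trade-off the abstract measures starts from the basic quantization step, which can be sketched as follows (a minimal uniform-quantization illustration with made-up weights, not RobustMQ's actual evaluation pipeline):

```python
def quantize_uniform(values, bits):
    """Uniformly quantize floats to 2**bits levels over their observed range.

    A minimal sketch of post-training uniform quantization; real deployments
    use per-channel scales, calibration data, and hardware-aware rounding.
    """
    levels = 2 ** bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels
    return [lo + round((v - lo) / scale) * scale for v in values]

weights = [-0.9, -0.3, 0.0, 0.25, 0.7]
q4 = quantize_uniform(weights, 4)   # finer grid, smaller rounding error
q2 = quantize_uniform(weights, 2)   # coarser grid, larger rounding error
err4 = max(abs(a - b) for a, b in zip(weights, q4))
err2 = max(abs(a - b) for a, b in zip(weights, q2))
```

Lower bit-widths introduce larger rounding error, which interacts differently with adversarial, natural, and systematic noise, as the evaluation above reports.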
Oil Saturation Boundary for Partial Oil and Partial Water Recognition in the Oil-Water Transition Zone
With the development of oilfields, oil reserves in the oil-water transition zone have gradually become a significant part of comprehensive reserves. The partial oil layer of the oil-water transition zone, in particular, has exploitation potential, but identifying the partial oil layer has become a difficulty in development planning for the zone. Over the years, there has been little research on the oil-water transition zone. This paper mainly studies the oil saturation boundaries for partial oil and partial water recognition. Two major approaches, theoretical calculation methods and the cumulative probability curve, have been applied in the study. The results provide a basis for further perforation development and dynamic adjustment.
Key words: The oil-water transition zone; Partial oil layer; Oil saturation; Theoretical calculation methods; Cumulative probability curve
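A cutoff search in the spirit of the cumulative probability curve can be sketched as follows (a hypothetical illustration with made-up saturation samples; the paper's actual boundaries also rely on theoretical calculation methods):

```python
def saturation_cutoff(oil_samples, water_samples, step=0.01):
    """Pick an oil-saturation cutoff separating 'partial oil' from
    'partial water' intervals by minimizing total misclassification
    of two labeled sample sets.

    A toy stand-in for reading the crossover of two cumulative
    probability curves; all sample values below are invented.
    """
    lo = min(oil_samples + water_samples)
    hi = max(oil_samples + water_samples)
    best_cut, best_err = lo, float("inf")
    cut = lo
    while cut <= hi:
        # Oil intervals below the cutoff and water intervals at or
        # above it would be misclassified by this boundary.
        err = (sum(s < cut for s in oil_samples)
               + sum(s >= cut for s in water_samples))
        if err < best_err:
            best_cut, best_err = cut, err
        cut += step
    return best_cut

oil = [0.55, 0.60, 0.62, 0.68, 0.70]      # hypothetical oil-layer saturations
water = [0.30, 0.35, 0.40, 0.45, 0.52]    # hypothetical water-layer saturations
cut = saturation_cutoff(oil, water)
```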
Dynamic Quality Metric Oriented Error-bounded Lossy Compression for Scientific Datasets
With the ever-increasing execution scale of high performance computing (HPC)
applications, vast amounts of data are being produced by scientific research
every day. Error-bounded lossy compression has been considered a very promising
solution to address the big-data issue for scientific applications because it
can significantly reduce the data volume with low time cost while allowing
users to control the compression errors with a specified error bound. The
existing error-bounded lossy compressors, however, are all developed based on
inflexible designs or compression pipelines, which cannot adapt to diverse
compression quality requirements/metrics favored by different application
users. In this paper, we propose a novel dynamic quality metric oriented
error-bounded lossy compression framework, namely QoZ. The detailed
contribution is three-fold. (1) We design a novel highly-parameterized
multi-level interpolation-based data predictor, which can significantly improve
the overall compression quality with the same compressed size. (2) We design
the error-bounded lossy compression framework QoZ based on the adaptive
predictor, which can auto-tune the critical parameters and optimize the
compression result according to user-specified quality metrics during online
compression. (3) We evaluate QoZ carefully by comparing its compression quality
with multiple state-of-the-art compressors on various real-world scientific application
datasets. Experiments show that, compared with the second-best lossy
compressor, QoZ can achieve up to 70% compression ratio improvement under the
same error bound, up to 150% compression ratio improvement under the same PSNR,
or up to 270% compression ratio improvement under the same SSIM.
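The predictor-plus-error-bound idea behind compressors in this family can be sketched in a few lines (a single-level, previous-value-predictor toy, not QoZ's multi-level interpolation or online auto-tuning):

```python
def compress_error_bounded(data, error_bound):
    """Sketch of prediction-based, error-bounded lossy compression:
    predict each point from its already-reconstructed neighbor, then
    quantize the residual to an integer multiple of 2*error_bound.

    Returns the integer codes (what would be entropy-coded) and the
    reconstruction the decoder would produce from them.
    """
    quantized, recon = [], []
    for i, x in enumerate(data):
        pred = recon[i - 1] if i else 0.0             # previous-value predictor
        code = round((x - pred) / (2 * error_bound))  # residual quantization
        quantized.append(code)
        recon.append(pred + code * 2 * error_bound)   # decoder-side value
    return quantized, recon

data = [1.00, 1.02, 1.05, 1.50, 1.49]
codes, recon = compress_error_bounded(data, error_bound=0.05)
```

Because each residual is quantized against the reconstructed (not original) neighbor, error never accumulates beyond the user-specified bound, which is the defining property of error-bounded lossy compression; near-constant runs collapse to zero codes, which compress extremely well.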
EvLog: Evolving Log Analyzer for Anomalous Logs Identification
Software logs record system activities, aiding maintainers in identifying the
underlying causes for failures and enabling prompt mitigation actions. However,
maintainers need to inspect a large volume of daily logs to identify the
anomalous logs that reveal failure details for further diagnosis. Thus, how to
automatically distinguish these anomalous logs from normal logs becomes a
critical problem. Existing approaches alleviate the burden on software
maintainers, but they are built upon a critical yet unrealistic assumption:
logging statements in the software remain unchanged. While software keeps
evolving, our empirical study finds that evolving software brings three
challenges: log parsing errors, evolving log events, and unstable log
sequences.
In this paper, we propose a novel unsupervised approach named Evolving Log
analyzer (EvLog) to mitigate these challenges. We first build a multi-level
representation extractor to process logs without parsing to prevent errors from
the parser. The multi-level representations preserve the essential semantics of
logs while leaving out insignificant changes in evolving events. EvLog then
implements an anomaly discriminator with an attention mechanism to identify the
anomalous logs and avoid the issue brought by the unstable sequence. EvLog has
shown effectiveness in two real-world system evolution log datasets with an
average F1 score of 0.955 and 0.847 in the intra-version setting and
inter-version setting, respectively, which outperforms other state-of-the-art
approaches by a wide margin. To the best of our knowledge, this is the first
study on tackling anomalous logs over software evolution. We believe our work
sheds new light on the impact of software evolution and offers corresponding
solutions to the log analysis community.
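Why a parsing-free representation tolerates evolving log events can be illustrated with a deliberately simple token-overlap measure (a toy sketch with made-up log lines; EvLog's actual multi-level extractor is learned, not Jaccard-based):

```python
def token_set(message):
    """Represent a raw log message as its set of lowercase tokens,
    with no template parsing step that could fail on evolved events."""
    return set(message.lower().split())

def similarity(a, b):
    """Jaccard similarity between two raw log messages."""
    sa, sb = token_set(a), token_set(b)
    return len(sa & sb) / len(sa | sb)

old = "Connection to node 5 lost, retrying"
new = "Connection to node 5 lost, retry scheduled"   # evolved wording, same event
anomaly = "Checksum mismatch in block 9f3a"          # genuinely different event

same_event_score = similarity(old, new)
anomaly_score = similarity(old, anomaly)
```

A small rewording of a known event keeps most of its tokens, so it scores close to the original, while a truly anomalous message shares almost nothing; a learned semantic representation extends this intuition beyond literal token overlap.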
FOCAL: Contrastive Learning for Multimodal Time-Series Sensing Signals in Factorized Orthogonal Latent Space
This paper proposes a novel contrastive learning framework, called FOCAL, for
extracting comprehensive features from multimodal time-series sensing signals
through self-supervised training. Existing multimodal contrastive frameworks
mostly rely on the shared information between sensory modalities, but do not
explicitly consider the exclusive modality information that could be critical
to understanding the underlying sensing physics. In addition, existing
contrastive frameworks for time series have not handled temporal information
locality appropriately. FOCAL solves these challenges by making the following
contributions: First, given multimodal time series, it encodes each modality
into a factorized latent space consisting of shared features and private
features that are orthogonal to each other. The shared space emphasizes feature
patterns consistent across sensory modalities through a modal-matching
objective. In contrast, the private space extracts modality-exclusive
information through a transformation-invariant objective. Second, we propose a
temporal structural constraint for modality features, such that the average
distance between temporally neighboring samples is no larger than that of
temporally distant samples. Extensive evaluations are performed on four
multimodal sensing datasets with two backbone encoders and two classifiers to
demonstrate the superiority of FOCAL. It consistently outperforms the
state-of-the-art baselines in downstream tasks with a clear margin, under
different ratios of available labels. The code and self-collected dataset are
available at https://github.com/tomoyoshki/focal.
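The factorized latent space rests on keeping shared and private features orthogonal, which a training objective can encourage with a penalty term like this (a minimal sketch over hand-picked vectors; FOCAL's full objective adds modality-matching and transformation-invariance terms over learned encoders):

```python
def dot(u, v):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))

def orthogonality_penalty(shared, private):
    """Squared inner product of shared and private features: exactly zero
    when the two subspace embeddings are orthogonal, positive otherwise.
    A factorized-latent-space loss could add this term per sample."""
    return dot(shared, private) ** 2

shared = [1.0, 0.0, 2.0]       # hypothetical shared-space embedding
private_good = [0.0, 3.0, 0.0]  # orthogonal: carries only exclusive information
private_bad = [1.0, 1.0, 0.0]   # overlaps with the shared direction
```

Minimizing this penalty during training pushes the private space to encode only modality-exclusive information that the shared space does not already capture.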